Patterns of i.i.d. Sequences and Their Entropy - Part I: General Bounds

نویسنده

  • Gil I. Shamir
چکیده

Tight bounds on the block entropy of patterns of sequences generated by independent and identically distributed (i.i.d.) sources are derived. A pattern of a sequence is a sequence of integer indices with each index representing the order of first occurrence of the respective symbol in the original sequence. Since a pattern is the result of data processing on the original sequence, its entropy cannot be larger. Bounds derived here describe the pattern entropy as function of the original i.i.d. source entropy, the alphabet size, the symbol probabilities, and their arrangement in the probability space. Matching upper and lower bounds derived provide a useful tool for very accurate approximations of pattern block entropies for various distributions, and for assessing the decrease of the pattern entropy from that of the original i.i.d. sequence.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Patterns of i.i.d. Sequences and Their Entropy - Part II: Bounds for Some Distributions

A pattern of a sequence is a sequence of integer indices with each index describing the order of first occurrence of the respective symbol in the original sequence. In a recent paper, tight general bounds on the block entropy of patterns of sequences generated by independent and identically distributed (i.i.d.) sources were derived. In this paper, precise approximations are provided for the pat...

متن کامل

Some properties of the parametric relative operator entropy

The notion of entropy was introduced by Clausius in 1850, and some of the main steps towards the consolidation of the concept were taken by Boltzmann and Gibbs. Since then several extensions and reformulations have been developed in various disciplines with motivations and applications in different subjects, such as statistical mechanics, information theory, and dynamical systems. Fujii and Kam...

متن کامل

Stability Bounds for Stationary φ-mixing and β-mixing Processes

Most generalization bounds in learning theory are based on some measure of the complexity of the hypothesis class used, independently of any algorithm. In contrast, the notion of algorithmic stability can be used to derive tight generalization bounds that are tailored to specific learning algorithms by exploiting their particular properties. However, as in much of learning theory, existing stab...

متن کامل

Stability Bounds for Stationary phi-mixing and beta-mixing Processes

Most generalization bounds in learning theory are based on some measure of the complexity of the hypothesis class used, independently of any algorithm. In contrast, the notion of algorithmic stability can be used to derive tight generalization bounds that are tailored to specific learning algorithms by exploiting their particular properties. However, as in much of learning theory, existing stab...

متن کامل

High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences

Nowadays high fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discover all patterns having a high utility meeting a user-specified minimum high utility threshold. It comprises extracting patterns which are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/cs/0605046  شماره 

صفحات  -

تاریخ انتشار 2006